Emotional Speech Recognition and Emotion Identification in Farsi Language
Authors
Abstract
Speech emotion can add information to speech beyond the available textual content; however, it also introduces problems for the speech recognition process. In a previous study, we showed the substantial changes in speech parameters caused by speech emotion. Therefore, to improve emotional speech recognition rate, the effects of emotion on speech parameters should first be evaluated, and the recognition accuracy should then be improved through the application of suitable parameters. The changes in speech parameters, i.e. formant frequencies and pitch frequency, due to anger and grief were evaluated for the Farsi language in our former research. In this research, using those results, we try to improve emotional speech recognition accuracy using baseline models. We show that adding parameters such as formant and pitch frequencies to the speech feature vector can improve recognition accuracy. The amount of improvement depends on the parameter type, the number of mixture components, and the emotional condition. Proper identification of the emotional condition can also help improve speech recognition accuracy. To recognize the emotional condition of speech, formant and pitch frequencies were used successfully in two different approaches, namely a decision tree and a GMM.
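As a rough illustration of the two ideas in the abstract, the sketch below shows (i) appending pitch and formant frequencies to a per-frame spectral feature vector and (ii) identifying the emotional condition with one GMM per emotion class. This is a minimal sketch assuming NumPy/scikit-learn interfaces and illustrative array shapes; the paper's actual feature extraction, corpus, and model configuration are not reproduced here.

# Sketch only: shapes and class labels are illustrative assumptions,
# not the setup used in the paper.
import numpy as np
from sklearn.mixture import GaussianMixture


def augment_features(mfcc, pitch, formants):
    """Append pitch and formant frequencies to each frame's feature vector.

    mfcc:     (n_frames, n_mfcc)  spectral features per frame
    pitch:    (n_frames,)         pitch frequency per frame (Hz)
    formants: (n_frames, n_form)  formant frequencies per frame (Hz)
    """
    return np.hstack([mfcc, pitch[:, None], formants])


class GMMEmotionIdentifier:
    """One GMM per emotional condition; choose the class with the highest likelihood."""

    def __init__(self, n_components=8):
        self.n_components = n_components
        self.models = {}

    def fit(self, features_by_emotion):
        # features_by_emotion: dict mapping emotion label -> (n_frames, dim) array
        for emotion, feats in features_by_emotion.items():
            gmm = GaussianMixture(n_components=self.n_components,
                                  covariance_type="diag", random_state=0)
            gmm.fit(feats)
            self.models[emotion] = gmm

    def predict(self, feats):
        # Average per-frame log-likelihood under each emotion model; return the best.
        scores = {emotion: model.score(feats) for emotion, model in self.models.items()}
        return max(scores, key=scores.get)

The number of mixture components (n_components above) is one of the factors the abstract reports as affecting the amount of improvement; the value 8 here is only a placeholder.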
Similar Resources
Recognition of Emotional Speech and Speech Emotion in Farsi
Speech emotion can add extra information to speech in comparison with available textual information. However, it can also lead to some problems in the automatic speech recognition process. We evaluated the changes in speech parameters, i.e. formant frequencies and pitch frequency, due to anger and grief for the Farsi language in a former study. Here, using those results, we try to improve emotio...
Enhancing Multilingual Recognition of Emotion in Speech by Language Identification
We investigate, for the first time, if applying model selection based on automatic language identification (LID) can improve multilingual recognition of emotion in speech. Six emotional speech corpora from three language families (Germanic, Romance, Sino-Tibetan) are evaluated. The emotions are represented by the quadrants in the arousal/valence plane, i. e., positive/negative arousal/valence. ...
Speech emotion recognition in emotional feedback for Human-Robot Interaction
For robots to plan their actions autonomously and interact with people, recognizing human emotions is crucial. For most humans, nonverbal cues such as pitch, loudness, spectrum, and speech rate are efficient carriers of emotion. The features of the sound of a spoken voice probably contain crucial information on the emotional state of the speaker; within this framework, a machine might use such pro...
Data Pre-processing in Emotional Speech Synthesis by Emotion Recognition
Synthesizing emotional speech by means of conversion from neutral speech allows us to generate emotional speech from many existing Text-to-Speech (TTS) systems. How much of the target emotion can be portrayed by the generated speech is largely dependent on the emotion data used to train the mapping function for voice transformation. In this paper, we introduce a method to pre-process the emotio...
Emotion attribute projection for speaker recognition on emotional speech
Emotion is one of the important factors that cause system performance degradation. By analyzing the similarity between channel effects and emotion effects on speaker recognition, an emotion compensation method called emotion attribute projection (EAP) is proposed to alleviate intra-speaker emotion variability. The use of this method has achieved an equal error rate (EER) reduction of 11.7%...
Emotion Identification for Evaluation of Synthesized Emotional Speech
In this paper, we propose to evaluate the quality of emotional speech synthesis by means of an automatic emotion identification system. We test this approach using five different parametric speech synthesis systems, ranging from plain non-emotional synthesis to full re-synthesis of pre-recorded speech. We compare the results achieved with the automatic system to those of human perception tests....
Journal: The Modares Journal of Electrical Engineering
Publisher: Tarbiat Modares University
ISSN: 2228-527X
Volume 8, Issue 1, 2008
Keywords